On Evaluation of Ensemble Forecasts Calibration Using the Concept of Data Depth

Authors

  • MAHSA MIRZARGAR
  • JEFFREY L. ANDERSON
Abstract

Various generalizations of the univariate rank histogram have been proposed to inspect the reliability of an ensemble forecast or analysis in multidimensional spaces. Multivariate rank histograms provide insight into the misspecification of genuinely multivariate features, such as the correlation between variables in a multivariate ensemble. However, patterns in a multivariate rank histogram should be interpreted with care. This paper focuses on multivariate rank histograms built on the concept of data depth and outlines some important considerations that should be accounted for when using them. To generate correct multivariate rank histograms using the concept of data depth, the data type of the ensemble should be taken into account to define a proper pre-ranking function. We demonstrate how and why some pre-ranking functions might not be suitable for multivariate or vector-valued ensembles, and we propose pre-ranking functions based on the concept of simplicial depth that are applicable to both multivariate points and vector-valued ensembles. In addition, there is an inherent identifiability issue associated with the center-outward pre-ranking functions used to generate multivariate rank histograms. This problem can be alleviated by complementing the multivariate rank histogram with other well-known multivariate statistical inference tools based on rank statistics, such as the depth-versus-depth (DD) plot. Using a synthetic example, we show that the DD-plot is less sensitive to sample size than multivariate rank histograms.
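The idea of a depth-based pre-rank can be illustrated with a minimal sketch for two-dimensional points. It is not the authors' implementation: the function names (`in_triangle`, `simplicial_depth`, `depth_rank`, `rank_histogram`), the brute-force enumeration of triangles, the use of closed triangles, and the random tie-breaking are all assumptions made for illustration. The observation is pooled with the ensemble members, every pooled point receives a sample simplicial depth (the fraction of triangles spanned by pooled points that contain it), and the rank of the observation's depth is tallied over many forecast cases.

```python
from itertools import combinations

import numpy as np


def in_triangle(p, a, b, c):
    """Return True if 2-D point p lies in the closed triangle (a, b, c)."""
    def cross(u, v):
        # z-component of (u - p) x (v - p)
        return (u[0] - p[0]) * (v[1] - p[1]) - (u[1] - p[1]) * (v[0] - p[0])

    d1, d2, d3 = cross(a, b), cross(b, c), cross(c, a)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # p is inside (or on the boundary) when the signs never disagree.
    return not (has_neg and has_pos)


def simplicial_depth(p, sample):
    """Fraction of triangles with vertices in `sample` that contain p."""
    triangles = list(combinations(range(len(sample)), 3))
    hits = sum(
        in_triangle(p, sample[i], sample[j], sample[k]) for i, j, k in triangles
    )
    return hits / len(triangles)


def depth_rank(observation, ensemble, rng):
    """Center-outward rank of the observation among m + 1 pooled points.

    Rank 1 means the observation is the most outlying point and rank m + 1
    means it is the deepest. Ties are broken at random (an assumption).
    """
    pooled = np.vstack([observation, ensemble])
    depths = np.array([simplicial_depth(x, pooled) for x in pooled])
    below = int(np.sum(depths[1:] < depths[0]))
    ties = int(np.sum(depths[1:] == depths[0]))
    return 1 + below + int(rng.integers(0, ties + 1))


def rank_histogram(n_cases, m, rng):
    """Counts of depth ranks when observation and ensemble share a distribution."""
    ranks = [
        depth_rank(rng.standard_normal(2), rng.standard_normal((m, 2)), rng)
        for _ in range(n_cases)
    ]
    return np.bincount(ranks, minlength=m + 2)[1:]


if __name__ == "__main__":
    counts = rank_histogram(n_cases=200, m=9, rng=np.random.default_rng(0))
    print(counts)  # one count per rank 1..m+1
```

Because the ranking is center-outward, a surplus of low ranks indicates that the observation tends to fall outside the ensemble cloud, but the histogram alone does not say in which direction, which is the identifiability issue the abstract mentions. The brute-force depth costs on the order of m³ triangle tests per point, so this sketch is only practical for small ensembles.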


Similar resources

A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows

One of the most important issues concerning the sensor data in the Wireless Sensor Networks (WSNs) is the unexpected data which are acquired from the sensors. Today, there are numerous approaches for detecting anomalies in the WSNs, most of which are based on machine learning methods. In this research, we present a heuristic method based on the concept of “ensemble of classifiers” of data minin...

Ensemble Forecasts Using Rank Histograms

Any decision making process that relies on a probabilistic forecast of future events necessarily requires a calibrated forecast. This paper proposes new methods for empirically assessing forecast calibration in a multivariate setting where the probabilistic forecast is given by an ensemble of equally probable forecast scenarios. Multivariate properties are mapped to a single dimension...

Extending extended logistic regression to effectively utilize the ensemble spread

To achieve well calibrated probabilistic forecasts, ensemble forecasts often need to be statistically post-processed. One recent ensemble-calibration method is extended logistic regression which extends the popular logistic regression to yield full probability distribution forecasts. Although the purpose of this method is to post-process ensemble forecasts, mostly only the ensemble mean is used...

Investigation on the Climatic Parameters Fluctuation Using Data from the European Centre for Medium-Range Weather Forecasts (Case study: Shirkouh Region - Yazd Province)

Any change in the climate system affects access to and management of natural resources such as water and soil. Temperature and precipitation are key elements of climate, and studying their trends can be important for atmospheric scientists, environmental managers, and planners in hydrology, agriculture, the environment, and related fields. In this study, the trend of climate fluctuations wa...

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...


Publication date: 2017